feat(seer): Wire explorer chat write site through SeerRun outbox #115231
trevor-e wants to merge 25 commits
Register an outbox category and cell receiver to handle SeerRun creation via the hybrid cloud outbox system.
…flag Rename idempotency_key to external_idempotency_key in the SeerRun outbox receiver to match the field name Seer's request models expect. Register the organizations:seer-run-mirror feature flag for future write-site gating.
Wrap response.json() in the SeerRun outbox receiver in a JSONDecodeError guard. A 2xx response with a malformed body would otherwise raise uncaught and trap the outbox row in indefinite retry. Treat it as terminal, matching how 4xx is handled. Also declare external_idempotency_key on AgentChatRequest and SearchAgentStartRequest and cast the receiver bodies to those TypedDicts so the call signatures type-check.
The previous fix imported JSONDecodeError via sentry.utils.json (which re-exports simplejson). urllib3 BaseHTTPResponse.json() raises stdlib json.JSONDecodeError, an unrelated class. The except clause never matched, so a 2xx with malformed body would still propagate uncaught and trap the outbox row in indefinite retry. Switch to stdlib json with a noqa for the S003 rule.
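A minimal illustration of why the first guard never matched: `except` matches on class identity, so two exception classes that share a name but not an inheritance chain do not catch each other. `LookalikeJSONDecodeError` is a hypothetical stand-in for the re-exported class, not the real simplejson type, and both function names are invented for this sketch.

```python
import json


class LookalikeJSONDecodeError(ValueError):
    """Hypothetical stand-in for a same-named class from another JSON library."""


def parse_with_wrong_guard(raw: str):
    try:
        return json.loads(raw)
    except LookalikeJSONDecodeError:
        # Never reached for stdlib decode errors: the two classes are unrelated.
        return None


def parse_with_stdlib_guard(raw: str):
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Matches the class that json.loads actually raises.
        return None
```

With malformed input, `parse_with_wrong_guard` lets `json.JSONDecodeError` propagate, while `parse_with_stdlib_guard` returns `None`.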
Match the immediate-neighbor convention (.get() + try/except DoesNotExist) for the missing-row check, and place handle_seer_run_create at the end of cell.py so newest receivers append rather than prepend.
Match against SeerRunType(run.type) instead of the raw str field, then call assert_never on the default branch. mypy now flags any new SeerRunType variant that does not have a case in handle_seer_run_create at compile time.
Behind the organizations:seer-run-mirror flag, the search-agent endpoint now creates a SeerRun + CellOutbox row in a transaction. The receiver fires on commit (flush=True), makes the HTTPS call to Seer with run.uuid as external_idempotency_key, and fills in seer_run_state_id. Synchronous flush preserves the existing endpoint contract: the endpoint still returns run_id from the response. The other write sites (start_run, autofix, PR review, replay) follow in their own PRs.
Push the seer-run-mirror flag check inside send_search_agent_start_request so the endpoint has a single call site that returns run_id directly. Eliminates the parallel start_search_agent_via_outbox helper and the flag-aware branching in the endpoint, and inlines the body construction that previously lived in _build_search_agent_body.
…al handling Wrap payload extraction and SeerRunType parsing in a single try/except so malformed outbox rows mark the run FAILED instead of crashing the receiver and stalling the queue. Extract a small _mark_seer_run_failed helper shared by the three terminal-failure sites (invalid payload, 4xx response, 2xx with malformed JSON body).
The previous refactor that folded the flag dispatch into send_search_agent_start_request dropped the search_agent.missing_run_id error log along with its organization/project/response_data context. Restore it inside send_search_agent_start_request before the SeerApiError is raised.
Add a proper create_seer_run factory method to Factories/Fixtures so SeerRun test instances use the standard test helper pattern. Remove test_passes_idempotency_key which tested an implementation detail (single-line dict merge) already covered by the happy-path tests. Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
… guard
Two review fixes in the SeerRun outbox receiver:
- PR_REVIEW raised NotImplementedError, which the outbox treats as transient and retries forever. Mark the run FAILED and return instead until PR_REVIEW dispatch is wired.
- The idempotency early-return used truthiness on seer_run_state_id, which would re-issue the Seer request for the (legal) value 0. Compare against None explicitly.
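The second fix comes down to the difference between a truthiness test and an explicit `None` comparison. A small sketch, with hypothetical function names:

```python
from typing import Optional


def already_dispatched_truthy(state_id: Optional[int]) -> bool:
    # Buggy variant: 0 is falsy, so a legal id of 0 looks "not dispatched".
    return bool(state_id)


def already_dispatched(state_id: Optional[int]) -> bool:
    # Fixed variant: only None means the request has not been issued yet.
    return state_id is not None
```

For `state_id == 0` the two disagree, which is exactly the case where the receiver would have re-issued the request.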
A structurally valid 2xx Seer response that lacks a run_id field won't self-heal on retry — same terminal class as the malformed-JSON case immediately above. Mark the run FAILED and return instead of raising RuntimeError, which the outbox treats as transient and would retry indefinitely.
dict(viewer_context or {}) coerced None into an empty dict, which the receiver would then forward to _resolve_viewer_context as a non-None ViewerContext with null fields — triggering JWT signing instead of being skipped. Preserve None when the caller passes None so the downstream skip path stays intact for future write sites.
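A small sketch of the coercion problem, with hypothetical helper names. The point is only that `dict(x or {})` cannot distinguish "no context" from "empty context".

```python
from typing import Any, Dict, Mapping, Optional


def copy_context_lossy(ctx: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    # Buggy variant: None silently becomes {}, so downstream code can no
    # longer tell that the caller passed no context at all.
    return dict(ctx or {})


def copy_context(ctx: Optional[Mapping[str, Any]]) -> Optional[Dict[str, Any]]:
    # Fixed variant: copy when present, preserve None otherwise.
    return dict(ctx) if ctx is not None else None
```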
urllib3's BaseHTTPResponse.json() raises UnicodeDecodeError for non-UTF-8 bodies in addition to json.JSONDecodeError. Both are terminal: a non-UTF-8 binary body from a misbehaving proxy won't self-heal on retry. Catch both in the same except clause.
Remove the two terminal-case comments that just restated the function flow, and move _mark_seer_run_failed below handle_seer_run_create per the public-then-private convention.
Caller saw an opaque OutboxFlushError when the synchronous drain failed; the endpoint's existing SeerApiError handler is a better fit. Same translation pattern token_exchange/{manual_,}refresher.py uses for the same reason. The async outbox retry will heal the mirror state separately.
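The translation could look roughly like the following. Both exception classes here are hypothetical stand-ins defined locally so the snippet runs on its own; the real classes and their constructors may differ. `flush` is an assumed callable representing the synchronous drain.

```python
from typing import Callable


class OutboxFlushError(Exception):
    """Hypothetical stand-in for the outbox flush failure."""


class SeerApiError(Exception):
    """Hypothetical stand-in: a message plus an HTTP-style status."""

    def __init__(self, message: str, status: int) -> None:
        super().__init__(message)
        self.status = status


def start_run_via_outbox(flush: Callable[[], int]) -> int:
    try:
        return flush()
    except OutboxFlushError as e:
        # Re-raise as the error type the endpoint already handles,
        # keeping the original exception as the cause for debugging.
        raise SeerApiError("Seer request failed during outbox flush", 500) from e
```

Chaining with `from e` keeps the original flush failure visible in tracebacks while the endpoint only has to know about one error type.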
When the synchronous drain fails, the SeerRun row is already committed and the async outbox retry will eventually heal it. Surface the run uuid as a retry_token in the 500 response so future frontend logic can resume that same run instead of creating a duplicate via a fresh idempotency key. No client changes today; the field is forward-compatible.
If response.json() returns a list/scalar/null instead of an object, data.get('run_id') would raise AttributeError and stall the outbox shard on retries. Add an isinstance check inside the existing try so non-dict bodies route through the same invalid_json_body terminal path as undecodable bodies.
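Taken together, the body checks from the commits above (undecodable bytes, malformed JSON, non-object body, missing run_id) can be sketched as one function. This is an illustration on raw bytes with the stdlib only, not the receiver's actual code; `extract_run_id` and the assumption that `run_id` is an integer are mine.

```python
import json
from typing import Optional


def extract_run_id(raw: bytes) -> Optional[int]:
    """Return run_id, or None for any terminal (non-retryable) body problem."""
    try:
        data = json.loads(raw.decode("utf-8"))
        if not isinstance(data, dict):
            # Lists, scalars and null are valid JSON but not a usable body.
            return None
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Malformed JSON or non-UTF-8 bytes will not self-heal on retry.
        return None
    run_id = data.get("run_id")
    # Assumption: a usable run_id is an integer; anything else is terminal.
    return run_id if isinstance(run_id, int) else None
```

In the receiver, each `None` case would correspond to marking the run FAILED and returning, rather than raising and letting the outbox retry.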
Add an optional `run_type` parameter to `SeerAgentClient.start_run()`. When provided and the `organizations:seer-run-mirror` flag is on, the method creates a SeerRun + CellOutbox entry instead of calling Seer directly. The explorer chat endpoint now passes `run_type=SeerRunType.EXPLORER`. Other callers (trigger_autofix_agent, etc.) omit it, keeping the direct path unchanged. Co-Authored-By: Claude <noreply@anthropic.com>
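The routing rule is that both conditions must hold before the outbox path is taken. A stripped-down sketch, where `via_outbox`, `direct`, and `mirror_enabled` are hypothetical parameters standing in for the real outbox write, the direct Seer call, and the feature-flag check:

```python
from typing import Any, Callable, Mapping, Optional


def start_run(
    body: Mapping[str, Any],
    *,
    run_type: Optional[str],
    mirror_enabled: bool,
    via_outbox: Callable[[str, Mapping[str, Any]], int],
    direct: Callable[[Mapping[str, Any]], int],
) -> int:
    """Route through the outbox only when a run_type is given AND the flag is on."""
    if run_type is not None and mirror_enabled:
        return via_outbox(run_type, body)
    # Callers that omit run_type, or orgs without the flag, keep the old path.
    return direct(body)
```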
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 9e2060d.
        "body": dict(chat_body),
        "viewer_context": dict(self.viewer_context),
    },
).save()
Missing OutboxFlushError handling in outbox path
High Severity
The outbox context with flush=True can raise OutboxFlushError if the signal receiver fails (e.g., Seer returns a 500, triggering a RuntimeError in handle_seer_run_create). This exception is not caught here, unlike the analogous code in search_agent_start.py which wraps the block in a try/except OutboxFlushError. Since OutboxFlushError does not inherit from SeerApiError, the endpoint's exception handler in organization_seer_agent_chat.py won't catch it either, resulting in an unhandled crash instead of a graceful error response.
run.refresh_from_db()
if run.mirror_status != SeerRunMirrorStatus.LIVE or run.seer_run_state_id is None:
    raise SeerApiError("Seer run mirror failed to materialize", 500)
return run.seer_run_state_id
Outbox path skips explorer index triggering
Medium Severity
The outbox path returns early at line 380, completely bypassing _maybe_trigger_explorer_index_for_new_run. Since the explorer chat endpoint is the only caller passing run_type, all explorer runs routed through the outbox will never trigger project indexing or context-engine indexing, even when Seer's response indicates missing indexes. The response data containing has_explorer_index and has_org_project_context is consumed only in the receiver, which discards those fields.


Summary
- Add an optional `run_type` param to `SeerAgentClient.start_run()`; when provided and `organizations:seer-run-mirror` is on, it routes through the outbox
- The explorer chat endpoint passes `run_type=SeerRunType.EXPLORER`
- Other callers (`trigger_autofix_agent`, etc.) omit it, keeping the direct path unchanged

Stacked on #115111. Third write site conversion (after assisted-query and legacy autofix).
Test plan
- `test_outbox_path_creates_run_and_returns_run_id` verifies SeerRun creation, outbox dispatch, and run_id return via the client
- `test_post_new_conversation_calls_client` updated to expect `run_type=SeerRunType.EXPLORER`